Gladwish et al. Radiation Oncology 2011, 6:121
http://www.ro-journal.com/content/6/1/121






RESEARCH Open Access 



Evaluation of early imaging response criteria in 
glioblastoma multiforme 

Adam Gladwish 1,2*†, Eng-Siew Koh 3,4†, Jeremy Hoisak 2,6, Gina Lockwood 7, Barbara-Ann Millar 2,7, Warren Mason 1,6,
Eugene Yu 8, Normand J Laperriere 2,7 and Cynthia Menard 2,5



Abstract 

Background: Early and accurate prediction of response to cancer treatment through imaging criteria is particularly 
important in rapidly progressive malignancies such as Glioblastoma Multiforme (GBM). We sought to assess the 
predictive value of structural imaging response criteria one month after concurrent chemotherapy and 
radiotherapy (RT) in patients with GBM. 

Methods: Thirty patients were enrolled from 2005 to 2007 (median follow-up 22 months). Tumor volumes were
delineated at the boundary of abnormal contrast enhancement on T1-weighted images prior to and 1 month after
RT. Clinical Progression [CP] occurred when clinical and/or radiological events led to a change in chemotherapy
management. Early Radiologic Progression [ERP] was defined as the qualitative interpretation of radiological
progression one month post-RT. Patients with ERP were classified as pseudoprogressors if clinically stable for >6
months. Receiver-operator characteristics were calculated for RECIST and MacDonald criteria, along with alternative
thresholds, against 1-year CP-free survival and 2-year overall survival (OS).

Results: 13 patients (52%) were found to have ERP, of whom 5 (38.5%) were pseudoprogressors. Patients with ERP 
had a lower median OS (11.2 mo) than those without (not reached) (p < 0.001). True progressors fared worse than
pseudoprogressors (median survival 7.2 mo vs. 19.0 mo, p < 0.001). Volume thresholds performed slightly better 
compared to area and diameter thresholds in ROC analysis. Responses of > 25% in volume or > 15% in area were 
most predictive of OS. 

Conclusions: We show that while a subjective interpretation of early radiological progression from baseline is 
generally associated with poor outcome, true progressors cannot be distinguished from pseudoprogressors. In 
contrast, the magnitude of early imaging volumetric response may be a predictive and quantitative metric of 
favorable outcome.

Background

In 1990, MacDonald et al [1] reported criteria for 
response assessment in glioma. Importantly, these criteria 
incorporated features such as time factors, degree of 
response of contrast-enhancing tumor using computed- 
tomography (CT)-based uni-dimensional World Health 
Organization (WHO) criteria [2], neurologic status and 
the use of corticosteroids. Although these criteria have
become widely accepted, they have also been criticized
for their limitations [3-5], including their inability to
accurately assess complex tumor morphology or to account
for non-tumor causes of contrast enhancement such as
reaction to local therapies [6], and their lack of
applicability to non-enhancing tumors. Furthermore, the
phenomenon of 'pseudoprogression' observed in patients
receiving concurrent chemo-radiotherapy [7-9], as well
as the dilemma of 'pseudo-response' seen with some of
the newer anti-angiogenic therapies [5,10], adds to the
already complex challenge of early assessment, as these
phenomena can confound image interpretation.

The accurate and early prediction of response and/or
progression remains important for several reasons.
Firstly, in principle, this may enable more objective
evaluation and comparison of novel therapies [5].
Secondly, such a biomarker could be utilized as a
surrogate endpoint in clinical trials, thus conferring
the distinct advantage of earlier response prediction and
greater opportunity to amend or institute alternate
therapies, especially given the aggressive nature of
Glioblastoma Multiforme (GBM). Thirdly, earlier imaging
predictors could potentially allow the conduct of smaller
clinical trials requiring fewer patients, and enable
earlier judgements about promising versus futile
therapies, more expeditious regulatory approval for new
drugs, and ultimately earlier application and translation
of new therapies into clinical practice [11,12]. In
reality, however, the evidence for reliable imaging
response thresholds that could ultimately influence
therapeutic decision making is still lacking. Currently,
response criteria are largely based on the response
evaluation criteria in solid tumors (RECIST) guidelines
[13,14], which were developed to standardize reporting of
outcomes of clinical trials. Most recently, the Response
Assessment in Neuro-Oncology (RANO) working group
provided updated criteria for high-grade gliomas [15],
but as yet there has been no analysis of these criteria
as they relate to clinical endpoints such as overall
survival and progression-free survival.

We embarked on a study investigating early structural 
and functional magnetic resonance imaging (MRI) eva- 
luations of response in patients with GBM. As a first 
step, we sought to investigate the predictive value of 
standard structural imaging response criteria one month 
after the delivery of concurrent chemotherapy and 
radiotherapy (RT). We also undertook exploratory ana- 
lysis of alternate structural imaging response thresholds 
that may better correlate with and/or predict for clinical 
outcomes. 

Methods 

This study was approved by the institutional research 
ethics board. Patients were prospectively enrolled over a 
26 month interval between May 2005 and July 2007. 
Patients were approached for enrollment if they met the
following criteria: histological diagnosis of WHO grade
IV Glioblastoma Multiforme; planned to receive definitive
concurrent chemotherapy (temozolomide 75 mg/m² daily) and
RT (60 Gy in 30 fractions over 6 weeks) followed by
adjuvant temozolomide chemotherapy (200 mg/m² × 5 days,
monthly for 1 year or until progression); age >18 years;
and ECOG performance status 0 or 1. Patients were
excluded if they had contraindications
to MRI, severe claustrophobia, or previous cranial radio- 
therapy. Relevant clinical and demographic information, 
including gender, age, diagnosis date, disease multi- 
focality, surgical status, and radiation treatment dates 
were also captured. 



MRI acquisition was performed at the following time- 
points: Baseline (BL) post-operatively but prior to radio- 
therapy (RT); week 3 and week 6 of RT, 1 month after 
completion of RT, then every two months until evidence 
of clinical progression (defined below) or until 1 year of 
follow-up. All images were acquired using a 1.5 T GE 
Signa Excite scanner (GE Healthcare, Waukesha, WI, 
USA). The MRI acquisition protocol was as follows: axial
post-contrast T1-weighted fast-spin echo (FSE) (TE = 20
ms, TR = 416.66 ms, FA = 90°, BW = 122.109, slice
thickness = 5 mm, slice spacing = 7 mm, 0.859 × 0.859 × 7
mm resolution).

Clinical and imaging end-points included: A) Time to
Clinical Progression [CP] - the interval between the
beginning of RT and CP, defined as the aggregate of
clinical and radiological progression resulting in a
change in patient management (for example, second-line
chemotherapy, salvage surgery or palliative care);
B) Overall Survival
[OS] - defined as the interval between beginning of RT 
and death; C) Early Radiological Progression [ERP] - 
qualitative impression of any radiological progression 
from baseline to one month post-RT as defined by a 
radiation oncologist (CM), and D) Pseudoprogression - 
when ERP was present but the patient showed clinically 
stable disease for at least 6 months post-RT without a 
change in the adjuvant chemotherapy regimen. 
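
The ERP and pseudoprogression definitions above amount to a small decision rule. The following is only an illustrative sketch: the function name, argument names and returned labels are hypothetical and not taken from the study, and treating "at least 6 months" as an inclusive cut-off is an assumption.

```python
def early_imaging_status(erp: bool, stable_months_post_rt: float,
                         chemo_changed: bool) -> str:
    """Label a patient from the one-month post-RT assessment.

    erp                   -- qualitative early radiological progression seen
    stable_months_post_rt -- months of clinically stable disease after RT
    chemo_changed         -- adjuvant chemotherapy regimen was changed
    All names and labels are hypothetical, for illustration only.
    """
    if not erp:
        return "no ERP"
    # Pseudoprogression: ERP present, yet clinically stable for at least
    # 6 months post-RT with no change in the adjuvant regimen.
    if stable_months_post_rt >= 6 and not chemo_changed:
        return "pseudoprogression"
    return "true early progression"
```

A patient with ERP whose follow-up is shorter than 6 months cannot really be assigned to either group; the sketch ignores that case and would label such a patient a true progressor.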

Post-contrast axial T1-weighted FSE images were
rigidly co-registered (mutual information algorithm)
with the RT planning CT datasets using a commercial
radiotherapy treatment planning system (Pinnacle³ v7.6c
and 8.1, Philips Radiation Oncology Systems, Madison,
WI). A radiation oncologist (ESK, NL) delineated tumor
volumes on the T1-weighted post-contrast MR images
as defined by areas of abnormal contrast enhancement 
reflecting residual or recurrent tumor, whilst excluding 
areas of post-surgical change. All volumes were then 
reviewed and finalized by a diagnostic radiologist (EY). 

Both longest diameter (axial, coronal, and sagittal
planes) and 3D volumetric data (cc) were computed at
baseline (BL) and one month post-RT. Progression was
then assessed via RECIST criteria: a 20% increase in the
longest tumor diameter or a 40% increase in volume
(sums of diameters or volumes were used in the case of
multi-focal disease). Disease response as determined by
RECIST was defined as a 30% decrease in diameter or a
65% decrease in volume. The MacDonald criteria were
also evaluated: progressive disease was defined as a 25%
increase in the largest tumor area (cm²) and responsive
disease as a 30% decrease in largest area. Each
patient was then classified in a binary fashion, as
having either progressive or responsive disease, based on
these imaging thresholds. In addition, the following
range of volume, area and diameter progression/response
thresholds (see Additional File 1 - Table 1) was
investigated: Diameter - any increase; any increase or
decrease > 5%, 15% or 30%. Area - any increase; any
increase or decrease > 5%, 15% or 30%. Volume - any
increase; > 25% increase; any increase or decrease
> 10%, 25% or 50%.

Sensitivity and specificity values were calculated for
each threshold using clinical progression-free survival at
1 year and overall survival at 2 years. Receiver-operator
curves (ROC) were also constructed and statistical
analysis was performed using the method of DeLong et al.
[16]. Kaplan-Meier survival curves were created to
analyze early progression, pseudoprogression and clinical
progression as previously defined.
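
The RECIST and MacDonald cut-offs stated in this section reduce to percent-change checks against baseline. A minimal sketch follows, assuming the cut-offs are inclusive (the text does not say whether they are) and using hypothetical names throughout; it is not the authors' software.

```python
# Fractional change from baseline at which each criterion flags progression
# (growth) or response (shrinkage), as stated in the Methods.
# Treating the cut-offs as inclusive is an assumption.
CRITERIA = {
    "RECIST diameter": {"progression": 0.20, "response": 0.30},
    "RECIST volume": {"progression": 0.40, "response": 0.65},
    "MacDonald area": {"progression": 0.25, "response": 0.30},
}


def fractional_change(baseline: float, follow_up: float) -> float:
    """Relative change from baseline (positive = growth)."""
    if baseline <= 0:
        raise ValueError("baseline measurement must be positive")
    return (follow_up - baseline) / baseline


def is_progression(baseline: float, follow_up: float, criterion: str) -> bool:
    """True if growth meets the criterion's progression cut-off."""
    cut = CRITERIA[criterion]["progression"]
    return fractional_change(baseline, follow_up) >= cut


def is_response(baseline: float, follow_up: float, criterion: str) -> bool:
    """True if shrinkage meets the criterion's response cut-off."""
    cut = CRITERIA[criterion]["response"]
    return fractional_change(baseline, follow_up) <= -cut
```

For multi-focal disease the inputs would be the summed diameters or volumes, as described above. The exploratory thresholds (for example a > 25% volume decrease) can be tested the same way by adding entries to the hypothetical `CRITERIA` table.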

Results 

A total of 30 patients were prospectively recruited. One
patient refused study procedures after enrollment and
another 4 patients did not undergo MRI examination
one month after RT, leaving a total of 25 patients for
whom imaging data were analyzed. It should be noted
that demographic and follow-up data were collected for
all 29 patients followed; however, only the demographics
of the 25 patients analyzed in this study are reported
here. The median age was 56 years (range 46 - 68 years;
15 men, 10 women). Five patients presented with
multifocal disease. Tumor volumes at baseline ranged from
0.96 cm³ to 143.2 cm³. The majority of patients were
enrolled after gross total resection (n = 14), while 8
and 3 patients underwent partial resection and biopsy
only, respectively.

The study cohort had a median follow-up of 26.3 
months (range 13.3 - 37.7 months). Median survival was 
high at 26.7 months and median time to clinical pro- 
gression was 7.5 months (range 1.5 mo. - 35.9 mo.). 

A qualitative impression of any radiological progression
(ERP) from baseline was found in 11 patients (40.0%),
although only 2 patients strictly met the MacDonald
criteria for progression at 1 month. Median survival for
patients with ERP was significantly shorter than for
those without (11.2 mo vs. not reached, p < 0.001)
(Figure 1). Of those with ERP, five were subsequently
determined to have pseudoprogression (45.5% of ERP).
Pseudoprogressors fared better than true early
progressors, with a median survival of 19.0 months vs.
7.2 months (p < 0.001) (Figure 2).
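
The median survivals quoted above are read from Kaplan-Meier curves. For readers unfamiliar with the estimator, here is a minimal pure-Python sketch of the product-limit calculation; the function names are hypothetical and the study's own statistical software is not specified in the text.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient (e.g. months)
    events -- 1/True if the patient died at that time, 0/False if censored
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median "not reached"
```

A return value of `None` corresponds to the "not reached" entries reported for the groups whose survival curve never falls to 50% during follow-up.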

Sensitivity and specificity values were calculated for each 
response threshold, along with the positive and negative 
likelihood ratios (+LH; -LH) and the area-under-the-curve 
(AUC) for volume, area and diameter metrics (see Addi- 
tional files 1, 2, 3 - Table 1, 2 and 3 respectively) in pre- 
dicting for 2-year overall survival. The most sensitive tests 
were those measuring response, namely greater than 25% 
and 50% decreases in volume and 15% and 30% decreases 
in area and diameter. The most specific tests were those
with the highest thresholds for progression, namely the
RECIST criteria for both volume and diameter, and
MacDonald criteria for area. In general, the volume
measurements consistently performed better in every
category than did the area and diameter metrics. This
trend can also be visualized in Figure 3, receiver-operator
curves plotting sensitivity vs. 1-specificity for the
volume, area and diameter thresholds against overall
survival at 2 years. The respective AUCs are 0.83 (0.59 -
0.94 95% CI), 0.76 (0.53 - 0.90 95% CI) and 0.69 (0.44 -
0.84 95% CI) for volume, area and diameter respectively.
These values were significantly different from chance
(AUC of 0.5) for both volume and area (p < 0.005 and
p < 0.05, respectively) but not for diameter (p > 0.1).
When comparing amongst AUCs there was no significant
difference between volume, area or diameter, with the
greatest trend seen between volume and diameter
(p > 0.1). The two most prognostic thresholds were > 15%
decrease in area (3.33 +LH, 0.22 -LH) and > 25% decrease
in volume (3.38 +LH, 0.21 -LH).

Figure 1 Overall survival according to 1 month radiological
progression status: Overall survival based on any early
radiological progression (ERP), observed one month after RT.
[Plot: survival vs. time (months); curves for No ERP and ERP;
p < 0.001.]

Figure 2 Overall survival according to true vs. pseudo-
progression status: Overall survival, based on true vs.
pseudoprogression at one month.

Figure 3 Receiver-Operator Curve by Dimension Metric:
Receiver-operator curves for volume (solid, square), area
(dashed, cross) and diameter (dashed, diamond) thresholds in
predicting 2 year overall survival. Line of indecision is
marked as a dotted line.

Figure 5 Kaplan-Meier survival according to 25% Volume
Response at 1 month: Kaplan-Meier survival curve for patients
with and without a > 25% response in tumour volume, one month
after RT.



Figure 4 compares the receiver-operator characteristics of
volume thresholds when predicting for progression-free
survival at 1 year and overall survival at 2 years,
demonstrating a trend for volume metrics to be more
predictive of overall survival at 2 years than of PFS at
1 year (AUC 0.83 vs. 0.70, p < 0.2). Figure 5 depicts
Kaplan-Meier survival based on > 25% volume response at
1 month post-RT, a difference that neared statistical
significance (median survival 14.9 mo vs. not reached,
p < 0.06).
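
The sensitivity, specificity and likelihood ratios (+LH, -LH) reported in the Results all derive from a 2x2 table of threshold result against outcome. The sketch below uses the standard textbook definitions; the function name is hypothetical, and the counts in the usage note are made up for illustration rather than taken from the study's data.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and likelihood ratios from a 2x2 table.

    tp/fn -- patients with the outcome who did / did not meet the threshold
    fp/tn -- patients without the outcome who did / did not meet it
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one patient in each outcome group")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Positive likelihood ratio: sensitivity / (1 - specificity).
    lr_pos = (sensitivity / (1 - specificity)
              if specificity < 1 else float("inf"))
    # Negative likelihood ratio: (1 - sensitivity) / specificity.
    lr_neg = ((1 - sensitivity) / specificity
              if specificity > 0 else float("inf"))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }
```

With illustrative counts of tp=8, fn=2, fp=3, tn=12, sensitivity and specificity are both 0.8, giving a positive likelihood ratio of about 4 and a negative likelihood ratio of about 0.25.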

Discussion 

The early and accurate prediction of response to cancer
treatment through the application of imaging criteria
has several potential advantages. Ideally, imaging
thresholds would provide utility as surrogates for
outcome over and above the more traditional measures
including overall and progression free survival [17],
allowing for more expeditious conduct of clinical trials
(both phase II [18] and III). This in turn could lead to
the earlier institution of alternate therapies that show a
beneficial effect on outcome. This is particularly
important in dealing with aggressive and rapidly growing
malignancies such as GBM.

Figure 4 Receiver-Operator Curve of Volume Metrics by
Clinical End-point: Receiver-operator curves for volume thresholds
in predicting for 2 year overall survival (solid, square) and 1 year
clinical progression-free survival (dashed, diamond). Line of
indecision is marked as a dotted line.

Our results show that across all thresholds, both
progressive and responsive, volume was uniformly more
predictive of OS and PFS, as seen by the right shift of
the diameter ROC curve in Figure 3 (AUC of 0.83 vs.
0.76 vs. 0.69). However, this was only a trend and did
not achieve significance amongst the three, the closest
being volume vs. diameter (p > 0.15). This is similar to
what Shah et al and Galanis et al have reported as
correlations between uni- and multi-dimensional
radiological data in classifying progressive disease [19,20].
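
The AUC values compared above summarize each ROC curve as a single number. One common way to obtain an AUC from a handful of threshold operating points is the trapezoidal rule, sketched below. This is an assumption made for illustration: the text states only that the statistical analysis followed DeLong et al. [16], and the function name is hypothetical.

```python
def roc_auc(points):
    """Trapezoidal area under an ROC curve.

    points -- iterable of (1 - specificity, sensitivity) pairs, one per
              threshold; the corners (0, 0) and (1, 1) are added here.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

With no operating points the function returns 0.5, the area under the line of indecision, which is the chance level against which the volume, area and diameter AUCs were tested.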

Furthermore, we show that a qualitative interpretation
of any radiological progression one month post-therapy
is associated with poor outcomes. However, this
assessment is not acted upon clinically because of the
confounding potential for treatment effect (or
pseudoprogression), and our current inability (clinically
and radiologically) to distinguish the two groups a priori.
Many recent investigations have looked at the incidence
and outcomes related to pseudoprogression [21-24].
Two Canadian studies by Roldan et al and Sanghera et
al found rates of pseudoprogression of 40% and 32%
respectively, and median survivals of 9.1 months and
31.2 months [22,23]. Another recent study by Gerstner
et al found the pseudoprogression rate to be 57% with a
median survival of 24.4 months; however, their definition
of pseudoprogression was at 3 months post-chemoRT
[24], compared to 6 months in this study (and the two
referenced previously). All three showed no significant
difference in OS between those with pseudoprogression
and those without ERP. The results from this study
were in keeping with other literature, including a rate of
pseudoprogression of 38.5% and a median survival of
19.0 months. There was also no survival difference between
pseudoprogressors and those patients with no ERP;
however, pseudoprogressors showed improved OS compared
with true early progressors (median survival 19.0
mo vs. 7.2 mo, p < 0.01), in keeping with the results of
Roldan and Sanghera [22,23]. This demonstrates that
there is sufficient qualitative information in early
structural imaging to help guide clinicians in identifying
progressive vs. responsive disease, with the exception of
pseudoprogression, a topic which is now finding its way
into the realm of imaging response criteria.

Historically, quantitative imaging criteria were first
addressed in 1979 by the WHO in their published
guidelines [2]. Since then, RECIST v1.0 [13] was
published in 2000 with subsequent revised criteria (version
1.1) in 2009 [14]. Each was developed in an attempt to
standardize reporting and facilitate comparison of
imaging response assessment within the context of clinical
oncology trials [4,11]; however, the results of this study
show that the ability to assess progressive disease via
quantitative radiological data remains limited. We found
that each of the MacDonald, RECIST and additional
thresholds, both uni- and multi-dimensional, while
specific for progressive disease, was highly insensitive.
This translated into a poor correlation with both PFS at
one year and OS at two years (Figure 4), therefore limiting
their usefulness as endpoint surrogates in clinical trials.
One obvious contributor to this effect is the issue of
pseudoprogression, in that pseudoprogressors will
always negatively impact the accuracy of progressive
thresholds based on standard structural imaging. Recent
updates in response assessment criteria by the RANO
group (Response Assessment in Neuro-Oncology) have
included an effort to address these challenges by
developing guidelines specific to the management of brain
tumors, including parameters for disease progression
[15]. They suggest deferring the determination of
progressive disease until > 12 weeks after the completion of
RT, except in the case of a new lesion outside of the
radiation field and/or pathology-proven progressive
disease within the original tumor site. This
recommendation aims to defer a change in clinical management
until pseudoprogression can be more reliably ruled out.
However, as was mentioned previously, the OS of
pseudoprogressors identified at one month after RT is not
significantly different from that of non-progressors, and
therefore if these patients could be identified more
readily, the truly progressive patients would avoid an
additional 8 weeks of ineffective chemotherapy.



In contrast, metrics for defining responsive disease
performed much better in terms of both PFS and OS
(Figure 4), likely in part because identifying responders
is not marred by the issue of pseudoprogression and
also because, intuitively, those with large reductions in
tumor burden will do better than those without. Clinical
trials showing evidence of radiological response in GBM
are therefore likely to have greater clinical relevance
in terms of survival endpoints than those focusing
on progressive characteristics. This is contrary to
the findings of Galanis et al, who found progressive
disease to be more predictive of OS. This difference is
probably multi-factorial. Firstly, a variety of gliomas
were included in the Galanis study, as compared to solely
GBM in this study. Secondly, there was a smaller
proportion of responders in the Galanis study, likely
owing in part to the addition of temozolomide to the
treatment regimen in this study. Finally, the timing of
the imaging was later in the Galanis study, 4 months
post-induction of therapy as compared to one month
post-RT in our study. This difference in timing may
decrease the incidence of pseudoprogressors, as a fraction
may have already declared themselves as true early
progressors by that point, thereby alleviating their
negative statistical impact on the progressive imaging
thresholds. If true, it is conceivable that optimizing
the timing of post-therapy follow-up imaging could aid in
the identification of pseudoprogressors. Our study only
looked at a single imaging time point; however, further
investigation into multiple imaging time points would
certainly be insightful. It is unlikely, however, that the
answer to this challenging issue lies in timing alone, and
as such an array of research continues to look for
potentially more robust and quantifiable solutions. Many
groups have looked at the use of functional imaging
modalities to augment standard anatomical information.
The addition of perfusion and diffusion-weighted
techniques is thought to be able to provide information
about tumor activity as a potential biomarker of tumor
progression [25]. As such, the role of functional MRI
(diffusion-weighted and perfusion) is the subject of
intense clinical investigation [26-33], and recent
findings have shown that diffusion-weighted imaging can
predict for OS and time-to-progression in high grade
glioma [29,30]. Furthermore, recent results by Tsien et
al. have shown promise in using dynamic susceptibility
contrast magnetic resonance imaging (DSC-MRI) and
parametric response maps measuring relative cerebral
blood volume to distinguish pseudoprogression from true
progression during therapy [34]. The role of
FLT-PET and molecular imaging is also being actively
investigated as a potential modality for imaging tumor
progression [35,36].

A primary limitation of our study lies in a relatively
small sample size of prospectively recruited Glioblastoma
patients. Our work must be further validated in a larger
cohort for meaningful interpretation and future clinical
translation. Furthermore, as was mentioned above, our
study only investigated a single imaging time point (one
month post-RT); additional imaging would be useful in
determining if there is an optimal time point, and what
that might be. Our study cohort had a significantly higher
median survival (26.2 mo. 95% CI 13.7 - not reached)
than expected from the literature (14.6 mo. 95% CI 13.2 -
16.8 [37]). Finally, baseline imaging in the study was
performed post-operatively, where resolving post-surgical
changes may have been a potential confounding factor in
the assessment of response. Strengths of this cohort
include a typical and balanced population demographic
in age, gender and size. Extent of surgery was also
balanced, with ~50% undergoing gross total resection and
the remainder having either partial resection or
biopsy alone. The extended length of follow-up (median
22 months) was also beneficial to this study.

Conclusion 

We sought to evaluate early radiologic response criteria
relevant to clinical outcomes in patients with GBM treated
with concurrent chemotherapy and radiotherapy, and
found that a qualitative clinical impression of radiologic
progression at one month after therapy was predictive of
poor outcomes despite the confounding factor of treatment
effect (pseudoprogression). Quantitatively, we found that
response metrics were more indicative of outcome than
progressive indices and that there was a trend of
volumetric data outperforming diameter or area thresholds;
however, significance was not reached in this case. Further
investigation will focus on adding imaging time
points as well as adjunct functional imaging to better
understand progression features that may have a stronger
predictive value than structural geometric indices alone.